[V1][Draft] Jump-forward decoding #15490

aarnphm · 2025-03-25T18:40:59Z

This PR aims to add support for jump-forward decoding by introducing the following API:

For the Grammar class:
- jump_forward_string: returns the longest string that can be jumped forward from the current FSM state
- find_token_divergence: returns the index up to which the output token IDs of (prev_output) and (prev_output+jf_string) are shared, and handles the FSM accordingly
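
To make the two methods above concrete, here is a minimal sketch of the idea, not the PR's implementation. It assumes a hypothetical character-level FSM stored as a plain dict and a toy greedy tokenizer; the function names are borrowed from the description, but their signatures, `FSM`, `VOCAB`, and `greedy_encode` are illustrative assumptions only.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy FSM: state -> {character: next_state}.
# It accepts text of the form {"name": followed by a string or a digit.
FSM: Dict[int, Dict[str, int]] = {
    0: {"{": 1},
    1: {'"': 2},
    2: {"n": 3},
    3: {"a": 4},
    4: {"m": 5},
    5: {"e": 6},
    6: {'"': 7},
    7: {":": 8},
    8: {'"': 9, "1": 10},  # two choices: the model has to decide here
}

# Hypothetical toy vocabulary; note the merged tokens '{"' and '":'.
VOCAB: Dict[str, int] = {"{": 0, '"': 1, '{"': 2, "name": 3, '":': 4, ":": 5}


def jump_forward_string(fsm: Dict[int, Dict[str, int]], state: int) -> Tuple[str, int]:
    """Follow the FSM while the current state has exactly one outgoing edge."""
    chars: List[str] = []
    seen = {state}
    while len(fsm.get(state, {})) == 1:
        ((ch, nxt),) = fsm[state].items()
        chars.append(ch)
        state = nxt
        if state in seen:  # guard against a cycle of forced transitions
            break
        seen.add(state)
    return "".join(chars), state


def greedy_encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Toy longest-match tokenizer (an assumption, not a real tokenizer)."""
    ids: List[int] = []
    max_len = max(len(piece) for piece in vocab)
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i : i + size]
            if piece in vocab:
                ids.append(vocab[piece])
                i += size
                break
        else:
            raise ValueError(f"cannot tokenize {text[i:]!r}")
    return ids


def find_token_divergence(prev_ids: Sequence[int], new_ids: Sequence[int]) -> int:
    """Return the length of the shared token-ID prefix of the two sequences."""
    shared = 0
    for a, b in zip(prev_ids, new_ids):
        if a != b:
            break
        shared += 1
    return shared


prev_text = "{"                                   # text generated so far
jf_string, next_state = jump_forward_string(FSM, 1)
prev_ids = greedy_encode(prev_text, VOCAB)
new_ids = greedy_encode(prev_text + jf_string, VOCAB)
divergence = find_token_divergence(prev_ids, new_ids)
```

In this toy setup the jumped string merges with the previous output during re-tokenization (`{` followed by `"` becomes the single token `{"`), so the shared prefix is empty and the earlier token would have to be rolled back. That re-tokenization effect is presumably why a divergence index is needed in addition to the jump-forward string itself.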

--wip--

Signed-off-by: Aaron Pham [email protected]

github-actions · 2025-03-25T18:41:09Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

aarnphm · 2025-03-26T01:34:30Z

Will ping once it is ready; discussing with Yixin atm to clear up some confusion.

mergify · 2025-04-01T08:21:13Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @aarnphm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Signed-off-by: Aaron Pham <[email protected]>

mergify · 2025-04-17T19:04:44Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @aarnphm.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify bot added the v1 label Mar 25, 2025

WoosukKwon assigned russellb and WoosukKwon Mar 25, 2025

mergify bot added and removed the tpu label Mar 27, 2025

mergify bot added the needs-rebase label Apr 1, 2025

mergify bot added and removed the tpu label Apr 9, 2025

aarnphm force-pushed the feat/jump-forward-structured-outputs branch from a1048e6 to 236830d April 13, 2025 02:01

mergify bot removed the needs-rebase label Apr 13, 2025

aarnphm added 2 commits April 14, 2025 15:43

chore: migrate tokenizer init to manager only

81aadb6

Signed-off-by: Aaron Pham <[email protected]>

--wip--

a97b172

Signed-off-by: Aaron Pham <[email protected]>

aarnphm force-pushed the feat/jump-forward-structured-outputs branch from 236830d to a97b172 April 17, 2025 19:04

mergify bot added the needs-rebase label Apr 17, 2025
